A convenient and easy method of studying traffic interaction,
for the determination of various traffic parameters, is the
analysis of video recorded data. A special Video
Instrumentation System mounted on a Maruti van (the test
vehicle) has been used to record 3D scenes of traffic movement
on various roads onto video cassettes. The data are available
in this form, and the recorded data are at present analysed
manually using hardware and software.

A detailed study of Artificial Neural Networks and Image
Processing has been carried out, and a strategy has been
chalked out to replace the manual data processing by
automatized image processing. Artificial neural networks, in
association with a good vision system, may prove to be
excellent for 3D visual scene analysis and may reduce the
time of analysis considerably.

As a first step towards this automatization, an attempt has
been made at vehicle shape recognition using Artificial
Neural Networks, which may be used for vehicle dynamics later
on. Software implementing an artificial neural network has
been developed and trained with a number of patterns
(corresponding to different views of a vehicle). These
patterns have to be in the language of computers, i.e., in the
form of numbers (n x n matrices), which are obtained by analog
to digital conversion of the 2D picture images (of the 3D
scenes) of different views of the vehicle. The different
views of the vehicle have been obtained by video recording
using the video instrumentation system fixed in a Maruti van.

The artificial neural network is a means of non-programmed
adaptive information processing. The network is trained with
known patterns so that it develops into a state in which it
can recognize similar patterns. Supervised training, in which
the outputs corresponding to the input patterns are known,
has been used in the software. The implementation of the
software needs consideration of several aspects to avoid
practical problems. Even an 8x8 input matrix (i.e., 64
inputs) requires a large memory, and a 512x512 matrix may
require a memory of 100 MB or higher. Moreover, a large CPU
time is required to bring the error down to a value within
limits.
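
The growth of memory with the order of the input matrix can be
illustrated by a rough estimate. The sketch below assumes a
fully connected network with one hidden layer of 100 units, one
output and 4 bytes per weight; these values are illustrative
assumptions, not those of the software described here.

```python
def weight_memory_bytes(n, hidden_units=100, outputs=1, bytes_per_weight=4):
    """Estimate weight storage for an n x n input pattern.

    Assumes one fully connected hidden layer; hidden_units, outputs and
    bytes_per_weight are illustrative assumptions, not thesis values.
    """
    inputs = n * n
    weights = inputs * hidden_units + hidden_units * outputs
    return weights * bytes_per_weight


for n in (8, 64, 512):
    mb = weight_memory_bytes(n) / 1e6
    print(f"{n}x{n} pattern: {n * n} inputs, about {mb:.2f} MB of weights")
```

Under these assumed values the 512x512 case comes to roughly
105 MB of weights alone, which is of the same order as the
figure quoted above.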

The error after training has been found to be sufficiently
low, which is a healthy sign for the future use of artificial
neural networks in image processing. The training of the
network with these patterns consumes considerable CPU time,
requires a large number of iterations and has a large memory
requirement. The higher the order of the matrices (patterns),
the greater the memory requirement; this restricts the order
of the matrices that can be used and thus increases the
error, which might otherwise have been even lower.

CHAPTER 1


INTRODUCTION 


1.1 GENERAL 

With fast increasing traffic volumes in the wake of all round
development, the need for upgrading and enhancing the
transportation network is strongly felt, and substantial
progress has to be made on limited national resources. A very
salutary effect of this constraint of resources has been the
focusing of attention on the need for accurate techno-economic
analysis of various design alternatives so as to attain
maximum returns, for example from the highway sector, in
terms of minimum transportation cost. To fulfil the need for
accurate and quick analysis, it is quite natural that
researchers began to search for automatized methods, and this
is how Artificial Intelligence (AI) and Artificial Neural
Networks (ANNs) began to be looked upon as effective tools
for automatized analysis in the field of Transportation
Systems also. To be more precise, our main intention at
present is the use of AI and ANNs in the field of Image
Processing for the determination of various parameters in an
automatized manner, so that this may help us in the design of
automatized intersection controls, automatized accident
warning systems, etc. This will also help fulfil the urgent
need for traffic simulation modelling for Indian conditions.
In fact, a model developed at the National Road and Traffic
Research Institute of Sweden was considered to be the most
suitable for modification to suit Indian conditions. There
was an agreement between the Government of India (Ministry of
Shipping and Transport) and the Swedish International
Development Agency according to which VTI and CRRI were to
carry out this study. On the Indian side, I.I.T. Kanpur was
assigned the analytical aspect, and that is what image
processing is being used for.

Now, before moving into the details of Image Processing and
Pattern Recognition, we must have an insight into the fields
of AI and ANNs so that we are able to make out how they can
be used for automatized image processing and parameter
determination.

1.2 ARTIFICIAL INTELLIGENCE 

As the name signifies, and as a layman understands the term,
Artificial Intelligence is something that is a replica of
natural intelligence. It has been a goal of science and
engineering for many decades to develop intelligent machines.
These machines were envisioned to perform all tedious and
cumbersome tasks so that we may enjoy a more fruitful and
enriched life. AI is a branch of computer science concerned
with the study and creation of computer systems that exhibit
some form of intelligence: systems that learn new concepts
and tasks, systems that can reason and draw useful
conclusions, systems that can understand a natural language
or perceive and comprehend a visual scene, and systems that
perform other types of feats that require human types of
intelligence.

Intelligence is not only the ability to exercise thought and
reason, as defined in the dictionaries. It in fact embodies
all of the knowledge and feats, both conscious and
unconscious, which we have acquired through study and
experience. It is the integrated sum of all those feats which
gives us the ability to recognize a face not seen for 30
years or more, or to send rockets to the moon. It is in fact
those capabilities which set Homo sapiens apart from other
living beings.

Can we ever expect to build systems which have these
characteristics? Yes, of course, is the answer. Systems have
already been developed to perform many types of intelligent
tasks, and expectations are high for the near future
development of even more impressive ones. We now have systems
which can:

(i) learn from examples, from being told, and from past
related experiences.

(ii) solve complex problems in scheduling, optimization,
planning of military strategies, and diagnosing diseases.

(iii) see well enough to "recognize" objects from
photographs, video cameras and other sensors.

(iv) understand large parts of natural language.

But we still have not been able to produce co-ordinated
autonomous systems which possess the abilities of even a
three year old child, which include the ability to:

* recognize and remember diverse objects in a scene.

* learn new sounds and associate them with objects and
concepts.

* adapt readily to many diverse situations.

The above mentioned inabilities are in fact the challenges
facing AI researchers.

Precisely, the goal in AI is to develop working computer
systems that are truly capable of performing tasks that
require high levels of intelligence. A better understanding
of AI is gained by looking at the component areas of study
that make up the whole. These are:

* Robotics

* Memory organization

* Knowledge representation

* Storage and recall

* Learning models

* Inference techniques

* Commonsense reasoning

* Understanding natural language

* Pattern recognition

* Machine vision methods

* Search and matching

* Speech recognition and synthesis

Importance of AI :

AI may be one of the most important developments of this
century, and countries leading in this field will emerge as
the dominant economic powers of the world. The Japanese were
the first to demonstrate their commitment to this field when,
in October 1981, they launched a very ambitious program in AI
called the Fifth Generation, which calls for implementation
of a 10 year plan to develop intelligent supercomputers, with
a combined budget of about 10 billion dollars. If they
succeed, their position as a leading economic power is
assured.

Research in the field of AI is also in progress in Britain,
France and the U.S., and one thing is clear: the future of a
country is closely tied to the commitment it is willing to
make in funding research programs in AI.

Fields closely related to AI are Mechanical Engineering,
Electrical Engineering, Linguistics, Psychology, Cognitive
Sciences, Philosophy, and Robotics.

Applications of AI have been proven in Civil Engineering,
Defense, Chemistry, Biology, Banking, Economics,
Manufacturing, Law, Medicine, Aerospace, etc.


1.3 ARTIFICIAL NEURAL NETWORKS 

Artificial Neural Networks (ANNs) are a part of AI, or rather
complementary to it. Imagine a computer that learns:
information is fed into it along with examples of the
conclusions it should be reaching, or with feedback on how it
is doing, or the machine may even be left to its own devices.
The information processing system of such a machine, called
the Artificial Neural Network, involves a non-algorithmic
approach wherein the computer simply runs through the
material again and again, making myriads of mistakes but
learning from them, until it finally gets itself into proper
shape to carry out the task successfully. Such behaviour of
ANNs is quite human, and their design is inspired by the
structure of the human brain and the working of the brain
cells, the neurons as they are called; hence the name.

Neurocomputing :

It is the engineering discipline concerned with
non-programmed adaptive information processing systems: the
neural networks that develop associations between objects in
response to their environment. It is a fundamentally new and
different information processing paradigm: the first
alternative to algorithmic programming. Its application may
reduce development costs and time, often by an order of
magnitude. It does not, however, replace algorithmic
programming, being in a state of infancy and applicable to
only certain types of problems. It is suspected that
neurocomputing and algorithmic programming may be
conceptually incompatible.

In fact, ANNs are biologically inspired, i.e., they are
composed of elements that perform in a manner analogous to
the most elementary functions of the biological neuron, the
brain cell. These elements are then organised in a way that
may (or may not) be related to the anatomy of the brain.
Despite this superficial resemblance, ANNs exhibit a
surprising number of the brain’s characteristics. For
example, they learn from experience, generalise from previous
examples to new ones, and abstract essential characteristics
from inputs containing irrelevant data. In spite of all this,
it cannot be suggested that ANNs will soon duplicate the
functions of the human brain. The actual intelligence
exhibited by the most sophisticated ANN is below the level of
a tapeworm. This reality should be kept in mind to check
over-enthusiasm. However, it is equally incorrect to ignore
the surprisingly brainlike performance of certain ANNs.

1.4 PROBLEM STATEMENT AND OBJECT OF STUDY 

The manual image processing which is currently being done is
a tedious process. Moreover, there is every possibility of
manual errors affecting the results. Hence it has become
rather imperative to automatize this process of 3D scene
analysis. As an era of ANNs has begun again after an eclipse
of 10 years, and this complement of AI is being found useful
for tackling such problems, it has been planned to make
efforts to use ANNs for our purpose. But it is still a very
new field wherein trial and error is going on, and whatever
theory has been propounded is probably still in the form of
hypothesis and not law. The results achieved also range from
a meagre 5% to as high as 98%.

The object of the present study is twofold. In the first
instance, an in-depth knowledge of the field of ANNs has been
gained, comprising the fundamentals of ANNs, the types of
ANNs, the types of training of ANNs, the algorithms involved
in training, and the working of ANNs. Thus we could know the
present state of this art. Secondly, an extensive study
revealed how ANNs can be applied to the present field: image
processing in Traffic Engineering. An effort has been made to
apply this knowledge to a practical problem, the shape
recognition of a vehicle. The 3D views of the vehicle at
different angles were converted into 2D picture frames, and
these were digitised as per a definite scheme, as is used in
the case of alphabet recognition. Software for an ANN using
the Backpropagation algorithm has been developed, and an
attempt has been made to train the ANN with these digitised
images of vehicles so that the network gets trained to
recognize any similar pattern (digitised image) to which it
may be exposed.
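
The conversion of a picture frame into an n x n pattern can be
sketched as follows. The digitisation scheme is not specified
at this point, so the block averaging, the threshold of 128
and the function name `digitise` are assumptions made purely
for illustration.

```python
def digitise(frame, n, threshold=128):
    """Reduce a grayscale frame (rows of 0..255 values) to an n x n 0/1 pattern.

    Block averaging and the threshold of 128 are assumptions made for this
    sketch; the thesis does not specify its digitisation scheme here.
    """
    rows, cols = len(frame), len(frame[0])
    bh, bw = rows // n, cols // n  # block height and width (remainder ignored)
    pattern = []
    for i in range(n):
        row = []
        for j in range(n):
            block = [frame[r][c]
                     for r in range(i * bh, (i + 1) * bh)
                     for c in range(j * bw, (j + 1) * bw)]
            row.append(1 if sum(block) / len(block) >= threshold else 0)
        pattern.append(row)
    return pattern


# A 4x4 toy frame with a bright upper-left quadrant, reduced to 2x2.
frame = [[255, 255, 0, 0],
         [255, 255, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
pattern = digitise(frame, 2)
inputs = [v for row in pattern for v in row]  # flattened input vector
print(pattern, inputs)
```

The flattened list is the form in which such a pattern would be
presented to the input layer of a network, one value per input.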

1.5 ORGANISATION OF THESIS 

The thesis has been organised under the following heads:

1. An introduction to AI and ANNs, the problem statement, the
object of study and the organisation of the thesis are
contained in Chapter 1.

2. Chapter 2 contains an extensive literature review,
including the work done in the international arena in the
recent past on ANNs, specially their applications to Civil
Engineering.

3. Chapter 3 contains the fundamentals, types, and details of
training and working of ANNs, together with a description of
the Backpropagation Algorithm.

4. The whole process of image processing, a description of 3D
computer vision systems, analog to digital conversion of
visual scenes, and processing of quantised data are the
topics which have been dealt with in Chapter 4.

5. A description of the manual processing (including
instrument setup) and the proposed automatized processing,
besides a description of the software developed for the
purpose, form the contents of Chapter 5.

6. The summary, conclusions, and scope for future work are
the topics that have been included in the concluding chapter,
Chapter 6.



CHAPTER 2


LITERATURE REVIEW


2.1 HISTORICAL PERSPECTIVE 

The improved understanding of the functioning of the neuron
and the pattern of its interconnections has allowed
researchers to produce mathematical models to test their
theories. Two mutually reinforcing objectives of neural
modelling were defined and remain today:

1. To understand the physiological and psychological
functioning of the human neural system.

2. To produce ANNs that perform brain-like functions.

Models of human learning were developed, of which the one
that has proven most fruitful was that of D.O. Hebb in 1949
(Wasserman, 1989). He proposed a learning law that became the
starting point for ANN training algorithms.

Early successes produced a burst of activity and optimism,
and networks consisting of single layers of artificial
neurons, called PERCEPTRONS, were developed and applied to
diverse fields like weather prediction, electrocardiogram
analysis and artificial vision.

But Marvin Minsky’s researches led to the publication of his
book Perceptrons, in which he and Seymour Papert proved that
single layered networks were theoretically incapable of
solving many simple problems, including the function
performed by a simple exclusive OR gate (Wasserman, 1989).
Minsky was not optimistic even about future progress. This
discouraged most of the researchers. Yet a few dedicated
scientists such as Teuvo Kohonen, Grossberg, Anderson, etc.
continued their efforts. Gradually a theoretical foundation
emerged upon which the more powerful multilayered neural
networks of today have been constructed.

ANNs today:

There have been many impressive demonstrations of ANN
capabilities, a few of which, using the backpropagation
algorithm, are:

S.No  Functions Performed                    Inventor

1.    Conversion of text to speech           Sejnowski and Rosenberg (1987)
2.    Recognition of handwritten characters  Burr (1987)
3.    Image compression                      Cottrell, Munro and Zipser (1987)

Many other algorithms have been developed and used in other
types of networks.




2.2 TRAINING NEURAL NETWORKS FOR TRAFFIC CONGESTION 


(For the specific terms used, Chapter 3 may be referred to.)

A major problem for all automatic control systems, both urban
and rural, is deciding whether a particular road link is
congested. This is because "congested" is a very subjective
notion and depends on the context in which the link is
situated. A second difficulty is that the value of a single
parameter such as vehicle flow or occupancy is not usually
sufficient to diagnose congestion. Hence a variety of factors
are used by an experienced operator viewing a link by CCTV to
decide that a particular link is congested. Unfortunately, it
is not easy to express this type of decision making
algorithmically. Another problem is that congestion on a link
cannot be removed without affecting other links, which
necessitates the consideration of a sub-area of several
links. A likely solution to this problem is the Artificial
Neural Network approach, as the network itself would develop
the required relationships between the data for different
links.

EXPERIMENTAL APPROACH: To explore the possible ways of
approaching this problem, data were provided by the
University of Nottingham, collected via an on-line computer
link to the SCOOT traffic control system of Leicester. The
data consisted of values, recorded every 5 minutes over 20
days, of 3 different parameters over 40 links: vehicle flow,
queue length, and percentage of total free flow capacity
used. These parameters were checked against an on-street
survey carried out as part of another project. An expert
familiar with the road system then went thoroughly through
the data and provided a list of time periods when a small
sub-area was considered to be congested. It was then examined
whether a model could be built which carried out the same
diagnosis.

Figure 2.2(a) RMS Error Curve (Single Parameter)

Figure 2.2(b) RMS Error Curve (Two Parameters)

Figure 2.2(c) RMS Error Curve (Three Parameters)

The approach taken was to train a back propagation 
neural network using the data set collected. The inputs 
consisted of the current congestion parameters from 20 
neighbouring detector sites, normalised to between 0 and 1. 
The output was a binary variable indicating whether the sub 
area was considered to be congested or not. 
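
The scaling of a detector's readings to the range 0 to 1 can be
sketched as below. The source does not state which
normalisation was used, so min-max scaling is an assumption
here, and the flow values are hypothetical.

```python
def normalise(values):
    """Min-max scale a list of readings to the range 0..1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant series: no spread to scale
    return [(v - lo) / (hi - lo) for v in values]


flows = [120, 300, 480, 210]  # hypothetical vehicle-flow readings
print(normalise(flows))
```

Each parameter would be scaled separately in this way so that
flow, queue length and capacity used all reach the network on
a comparable scale.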

Three different classes of experiments were conducted,
wherein the training set numbered about 400 examples for each
class (for details of training refer to Chapter 3). The
training was halted manually when either of the following two
conditions occurred:

(i) The RMS error had decayed to nearly zero and variations
were no longer considered significant.

(ii) The RMS error was oscillating randomly within a certain
range and further convergence appeared unlikely. This
condition is judged intuitively and is easily recognized with
experience.

One of these criteria was typically fulfilled after about
12,000 iterations, and hence graphs of RMS error were
presented for 16,000 iterations.
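
The RMS error and the two halting conditions can be sketched as
follows. In the study the training was halted manually; the
function `should_halt` and its thresholds (`tol`, `window`,
`min_gain`) are hypothetical stand-ins for that judgement, not
the procedure actually used.

```python
import math


def rms_error(targets, outputs):
    """Root-mean-square difference between target and actual outputs."""
    total = sum((t - o) ** 2 for t, o in zip(targets, outputs))
    return math.sqrt(total / len(targets))


def should_halt(history, tol=0.01, window=5, min_gain=0.001):
    """Automated stand-in for the two manual criteria (thresholds assumed).

    (i)  the latest RMS error has fallen below tol, or
    (ii) the best error in the last `window` iterations improves on the
         best error of the preceding window by less than min_gain.
    """
    if not history:
        return False
    if history[-1] < tol:
        return True
    if len(history) >= 2 * window:
        recent = min(history[-window:])
        earlier = min(history[-2 * window:-window])
        return earlier - recent < min_gain
    return False


oscillating = [0.3, 0.31, 0.29, 0.3, 0.31, 0.3, 0.29, 0.31, 0.3, 0.3]
converging = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.15, 0.1]
print(should_halt(oscillating), should_halt(converging))
```

The first series stalls around 0.3 and meets condition (ii),
while the second is still falling and meets neither condition.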

Class I :

Three neural networks were trained, using each congestion
parameter in turn, but in each case the training was halted
after little or no convergence was observed (Fig. 2.2a).

Class II :

Three more neural networks were trained, covering all
possible combinations of two parameters. Some convergence was
noted for all three networks, although when the training was
stopped the RMS error had not reached zero and quite
considerable oscillation was still apparent (Fig. 2.2b).

Class III :

Finally, a neural network was trained with all three
parameters, and the convergence was found to be quite good
with little final instability (Fig. 2.2c).

CONCLUSIONS: The data set not being very large, care has to
be taken while interpreting this set of results. Ideally the
neural network should be tested after training by presenting
a further set of data, which may not be possible owing to the
limited data. However, the fact that convergence occurred is
a clear indication that an ANN approach is likely to be
successful, and further research must follow. The fact that
convergence occurred when all three parameters were taken
together implies that congestion can be better diagnosed by
observation of several parameters, and that ANNs seem to be a
good solution to the problem of interpreting large amounts of
data which are interrelated but for which no straightforward
algorithm can be found.


2.3 AN OVERVIEW OF RECENT RESEARCH IN THE FIELD OF ANN 

The following works have been reported in the international
arena in the recent past:

1. A paper on road traffic monitoring using the TRIP 2 system
has been presented. It contains a brief review of present
image processing systems used for traffic monitoring,
including a discussion of the disadvantages of such systems.
The vehicle detection algorithm relies upon the ability of
the TRIP 2 system to learn from example and to discriminate
between complex patterns within video images of traffic
scenes. During site trials of the system it was possible to
detect 99% of the vehicles and to measure individual vehicle
speeds to an accuracy of between plus or minus 8% and 17%
(Dickinson, K.W. and Wan, C.L.).

2. A Self Organising Traffic Control System has been
developed using a Neural Network Model (Nakatsuji, T. and
Kaku, T., 1991).

3. Neural Networks have also been used for automated vehicle
dispatching. A Neural Network Model was proposed as a
sub-symbolic and empirical alternative for modelling the
decision process of expert dispatchers (Potvin, J.Y., Shen,
Y., and Rousseau, J.M.).

4. An Intelligent System for Automated Pavement Evaluation
has been developed. This research is directed towards an
innovative, noncontact, intelligent, nondestructive
evaluation (INDE) system, using a novel AI based approach
that would integrate 3 AI technologies, viz. computer vision,
neural networks, and knowledge based expert systems, wherein
a multilayer perceptron and the backpropagation learning rule
have been used (Ritchie, S.G., Kaseko, M., and Bavarian, B.,
1991).

5. A Neural Network Model for Freeway Incident Detection has
been developed. This paper presents the initial results of an
exploratory study investigating the application of Neural
Network Models from the field of AI to the automated
detection of non-recurring congestion on urban freeways. The
results are encouraging (Cheu, R.L., Ritchie, S.G., Recker,
W.W. and Bavarian, B.).

6. A paper on pavement image processing has been presented,
wherein the potential for employing neural network models in
pavement image interpretation has been discussed and
preliminary results presented (Kaseko, M.S., and Ritchie,
S.G.).

7. A self organising traffic control system using neural
network models has been developed (Nakatsuji, T., and Kaku,
T., 1991).

8. A modular neural network has been developed for
recognition of car registration plates (Margarita, G., 1990).

9. Drivers’ behaviour has been studied using a driving
simulator (Takubo, N., 1991).

10. A Neural Network has been used in the development of an
autonomous land vehicle (Pomerleau, D.A., 1989).

CHAPTER 3


THEORY

3.1 FUNDAMENTALS OF ANNs 

As already mentioned, ANNs are biologically inspired; that
is, when considering network configurations and algorithms,
researchers usually keep in mind the organization of the
brain. Knowledge about the brain’s overall operation is so
limited, however, that there is little to guide those who
would emulate it. Hence, network designs must go beyond
current biological knowledge, seeking structures that perform
useful functions. Despite this, artificial neural networks
continue to evoke comparisons with the brain, and their
functions are often reminiscent of human cognition.

The human nervous system is built of cells called neurons and
is a highly complex structure. An estimated 10^11 neurons
participate in perhaps 10^15 interconnections over
transmission paths that may range for a metre or more. Each
neuron shares many characteristics with the other cells in
the body but has unique capabilities to receive, process and
transmit electrochemical signals over the pathways that
comprise the brain’s communication system. A study of the
structure of the neuron will clarify the working.

Before going into details, it is necessary to get acquainted
with the following terms, which shall be used frequently:

NEURON: the basic building block of a Neural Network, also
known as a node or processing element.

CONNECTION: a signal transmission pathway between processing
elements, corresponding to the axons and synapses in a human
brain.

LAYERS: In theory any topological arrangement should work,
but for ease of analysis and visualisation it is usual to
arrange the nodes in layers, with all nodes in adjacent
layers connected to each other.

WEIGHTS: adaptive coefficients, each associated with a single
input connection. A weight determines the intensity of the
connection.


PROCESSING ELEMENT: an artificial neuron in a neural network.
It consists of a small amount of local memory and processing
power, and hence the name.

LEARNING LAW: an equation that modifies some or all of the
adaptive coefficients (weights) in a processing element’s
local memory in response to input signals and the values
supplied by the transfer function.

TRANSFER FUNCTION: a mathematical formula that, amongst other
things, determines a processing element’s output signal as a
function of the most recent input signals and the weights in
local memory.

AXON: the connection emerging from the cell body of a typical
biological neuron.

DENDRITES: finer sub-connections extending from the cell
body, at which signals from other neurons are received.

Structure of a pair of typical biological neurons (Fig. 3.1 a) :

Figure 3.1(a) shows the structure of a pair of typical
biological neurons. Dendrites extend from the cell body to
other neurons, where they receive signals at connection
points called synapses. On the receiving side of the synapses
these signals are conducted to the cell body, where they are
summed, some inputs tending to excite the cell whereas others
tend to inhibit its firing. When the cumulative excitation in
the cell body exceeds a threshold, the cell fires, sending a
signal down the axon to other neurons. Although this basic
outline has many complexities and exceptions, most ANNs model
only these simple characteristics.

The artificial neuron (Fig. 3.1 b) :

This was designed to mimic the first order characteristics of
the biological neuron. The working is:

* A set of inputs, each representing the output of another
neuron, is applied.

* Each input is multiplied by a corresponding weight,
analogous to a synaptic strength.

* All the weighted inputs are then summed to determine the
activation level of the neuron.

Figure 3.2(a) Sigmoidal Activation Function:
OUT = F(NET) = 1/(1 + e^{-NET}), F'(NET) = OUT(1 - OUT)

Figure 3.2(b) Hyperbolic Tangent Function

Although there are a great many network paradigms, all are
based upon this configuration. As per the figure, the summed
output is

NET = x_1 w_1 + x_2 w_2 + ... + x_n w_n = XW          (3.1)

where,

X = vector representing the set of inputs
W = vector representing the corresponding
weights (synaptic strengths)
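
A minimal sketch of the weighted sum of equation (3.1) is given
below. The input and weight values are hypothetical and are
chosen only to make the arithmetic easy to follow.

```python
def net(x, w):
    """Weighted sum NET = x1*w1 + ... + xn*wn of equation (3.1)."""
    if len(x) != len(w):
        raise ValueError("inputs and weights must have the same length")
    return sum(xi * wi for xi, wi in zip(x, w))


x = [1.0, 0.0, 1.0]      # hypothetical inputs
w = [0.5, -0.25, 0.25]   # hypothetical weights (synaptic strengths)
print(net(x, w))         # 0.5 + 0.0 + 0.25
```

An input of zero contributes nothing to NET whatever its
weight, while a negative weight acts as an inhibitory
connection.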

Activation Function :

The NET signal is usually further processed by an activation
function F to produce the neuron’s output signal OUT. This
may take any of the following forms:

(a) a simple linear function,

OUT = K (NET)          (3.2)

where K is a constant;

(b) a threshold function,

OUT = 1 if NET > T
OUT = 0 otherwise,

where T is a constant threshold value; or a function which
more accurately simulates the nonlinear transfer
characteristics of a biological neuron.

In Fig. 3.1 (b) the block labelled F accepts the NET output
and produces the signal OUT. If the F processing block
compresses the range of NET so that OUT never exceeds some
low limits regardless of the value of NET, then F is called a
squashing function. The squashing function is often chosen to
be the logistic or sigmoid (meaning S shaped) function.

Mathematically, it can be expressed as,

F(x) = 1/(1 + e^{-x})          (3.3)

Hence,

OUT = 1/(1 + e^{-NET})          (3.4)

The central high gain region of the logistic function 
solves the problem of processing small signals while its 
regions of decreasing gain at the positive and negative 
extremes are appropriate for large excitations. 

(c) Sometimes the hyperbolic tangent function OUT = tanh(NET)
is also used (Fig. 3.2 b). Like the logistic function, it is
S shaped, but it is symmetrical about the origin, so that OUT
has the value zero when NET is zero. Unlike the logistic
function, the hyperbolic tangent gives a bipolar value for
OUT, which is beneficial in certain networks.
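
The activation functions discussed above can be compared
numerically. The sketch below is illustrative only; the
function names and the default values of K and T are
assumptions, and the slope OUT(1 - OUT) is the logistic
derivative noted with Figure 3.2(a).

```python
import math


def linear(net, k=1.0):
    """Linear activation OUT = K * NET (K = 1 is an assumed default)."""
    return k * net


def threshold(net, t=0.0):
    """Threshold activation: 1 if NET > T, otherwise 0 (T = 0 assumed)."""
    return 1 if net > t else 0


def logistic(net):
    """Logistic (sigmoid) squashing function of equation (3.4)."""
    return 1.0 / (1.0 + math.exp(-net))


def logistic_slope(net):
    """Slope of the logistic function, OUT * (1 - OUT)."""
    out = logistic(net)
    return out * (1.0 - out)


for value in (-5.0, 0.0, 5.0):
    print(value, threshold(value),
          round(logistic(value), 4), round(math.tanh(value), 4))
```

At NET = 0 the logistic output is 0.5 with its largest slope of
0.25, whereas tanh gives 0; for large positive or negative NET
both functions flatten out, which is the squashing behaviour
described above.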

Single layered ANNs (Fig. 3.1 c)

Although a single neuron can perform certain simple 
pattern detection functions, the power of neural computation 
comes from connecting neurons into networks. The simplest 
network is a group of neurons arranged in a layer as shown in 
the Fig 3.1c. The circular nodes only serve to distribute the 
inputs, they perform no computations and hence will not be 
considered to constitute a layer. The set of inputs X has 
each of its elements connected to each artificial neuron 
through a separate weight. Early ANNs were no more complex 
than this. 

Calculating the outputs N of a layer is therefore a simple
matrix multiplication,

N = X W          (3.5)

where X is the row vector of inputs and W is the matrix of
weights. If a neuron's weighted sum is greater than a
predetermined threshold, its output is one; otherwise it is
zero. These systems and their many variations have
collectively been called Perceptrons (Fig. 3.1 d). Despite
the limitations of Perceptrons, they are a logical starting
point for a study of ANNs.
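
The layer calculation of equation (3.5), followed by the
threshold, can be sketched as below. The weight values and the
threshold of 0.5 are hypothetical; W is stored with one row
per input and one column per neuron.

```python
def layer_output(x, weights, threshold=0.5):
    """Outputs of a single layer of threshold neurons.

    x: list of m inputs; weights: m rows by n columns, so that column j
    holds the weights of neuron j. Returns a list of n outputs (0 or 1).
    """
    n = len(weights[0])
    sums = [sum(x[i] * weights[i][j] for i in range(len(x))) for j in range(n)]
    return [1 if s > threshold else 0 for s in sums]


x = [1, 0, 1]            # hypothetical input vector
W = [[0.4, 0.1],         # hypothetical 3 x 2 weight matrix
     [0.9, 0.9],
     [0.3, 0.2]]
print(layer_output(x, W))
```

Here the first neuron sums 0.4 + 0.3 and fires, while the
second sums 0.1 + 0.2 and does not.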

REPRESENTATION: This refers to the ability of a perceptron to
simulate a specified function. A single layered perceptron is
seriously limited in its representational ability. There are
many simple machines that the perceptron can’t represent, no
matter how the weights are adjusted. One of Minsky’s more
discouraging results shows that a single layered perceptron
can’t simulate a simple exclusive OR function.
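
The exclusive OR limitation can be illustrated by a small
search. The sketch below tries every combination of two
weights and a threshold on a coarse grid (an assumed range of
-2 to 2 in steps of 0.25) for a single threshold neuron. A
finite search of this kind is only an illustration and not a
proof.

```python
from itertools import product


def representable(truth_table, grid):
    """Return True if some (w1, w2, T) on the grid reproduces the truth table
    with the rule OUT = 1 if x1*w1 + x2*w2 > T else 0."""
    for w1, w2, t in product(grid, repeat=3):
        if all((1 if x1 * w1 + x2 * w2 > t else 0) == out
               for (x1, x2), out in truth_table.items()):
            return True
    return False


grid = [i / 4 for i in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25 (assumed)
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print("AND representable:", representable(AND, grid))
print("XOR representable:", representable(XOR, grid))
```

The search finds weights for AND but none for exclusive OR.
The general reason is short: exclusive OR needs T >= 0,
w1 > T and w2 > T, which together give w1 + w2 > T, yet the
input (1, 1) requires w1 + w2 <= T.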

Multilayered ANNs (Fig. 3.1 e)

Greater computational capabilities are offered by larger,
more complex networks. Multilayered networks may be formed by
simply cascading a group of single layers, the output of one
layer providing the input to the subsequent layer. For a
two layered network with no nonlinear activation function,
the output is calculated as

$N = (X W_1) W_2 = X (W_1 W_2)$ (3.6)

where $W_1$ and $W_2$ represent the first and second weight
matrices. This shows that a two layered linear network is
exactly equivalent to a single layered one having a weight
matrix equal to the product of the two weight matrices.
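
The equivalence in Eq. (3.6) can be checked numerically. The
following sketch uses small, arbitrarily chosen matrices (they
are assumptions made for illustration only) and a plain Python
matrix product.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


X = [[1.0, 2.0, 3.0]]                          # one input vector (1 x 3)
W1 = [[1.0, 0.0], [2.0, -1.0], [0.0, 3.0]]     # first weight matrix (3 x 2)
W2 = [[2.0, 1.0], [1.0, -2.0]]                 # second weight matrix (2 x 2)

two_layer = matmul(matmul(X, W1), W2)     # cascade of two linear layers
single_layer = matmul(X, matmul(W1, W2))  # one layer with the product matrix
```

Both expressions give the same output vector, which is why a
nonlinear activation function between the layers is needed
before extra layers add any representational power.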


Recurrent Networks 

More general networks that do contain feedback 
connections are said to be Recurrent Networks. 

3.2 TRAINING AND WORKING OF ANNs 

Out of all the interesting characteristics of ANNs the
most important and eye catching is their ability to learn.
Their training shows so many parallels to the intellectual
development of human beings that it may seem that a
fundamental understanding of the process has been achieved.
(Fig. 3.3 gives a general idea of the working of ANNs, wherein
a simple artificial neural network consisting of a retina of
6x6 receptor cells, 10 hidden cells, and 1 output cell has
been shown. Only those connections originating from one of
the receptor cells have been illustrated for clarity.)

However, learning in ANNs is limited and many difficult
problems remain to be solved before it can be determined if we
are even on the right track.

Objective of Training : 

A network is trained so that application of a set of 
inputs produces the desired or at least consistent set of 
outputs. Each such input (or output) set is referred to as a 
vector. Training is accomplished by sequentially applying 
input vectors while adjusting network weights according to a 
predetermined procedure. During training the network weights 
gradually converge to values such that each input vector 
produces the desired output vector. 

Types of Training : 

Training can be categorised into :

a) Supervised training : This requires the pairing of each
input vector with a target vector representing the desired
output; together these are called a Training Pair.

Usually a network is trained over a number of such
training pairs. An input vector is applied, the output of
the network is calculated and compared to the corresponding
target vector. The difference (error) is fed back through the
network and the weights are changed according to an algorithm
(Backpropagation being the most common) that tends to
minimize the error. The vectors of the training set are
applied sequentially, the errors calculated and the weights
adjusted for each vector until the error for the entire
training set is at an acceptably low value.

b) Unsupervised Learning: In spite of many successes,
supervised training has been criticized because it is
difficult to conceive of a training mechanism in the brain
that compares desired and actual outputs and feeds processed
corrections back through the network. Several questions
remain unanswered :

If this were the brain’s mechanism, where do the desired 
outputs come from? 

How could the brain of an infant accomplish the self 
organization that has been proven to exist in early 
development? 

Thus unsupervised learning is a far more plausible model
of learning in the biological system. Developed by Kohonen
(1984) and many others, it requires no target vectors for the
outputs and hence no comparisons to predetermined ideal
responses. The training set consists solely of input vectors
and the training algorithm modifies network weights to
produce output vectors that are consistent. The training
process therefore groups similar vectors into classes.

Training Algorithm : 

Most of today’s training algorithms have evolved from
the concepts of D.O. Hebb (1961). He proposed a model for
unsupervised learning in which the synaptic strength (weight)
was increased if both the source and destination neurons
were activated. In this way often used paths in a network are
strengthened, and the phenomena of habit and learning through
repetition are explained.

An ANN using Hebbian learning will increase its network 
weights according to the products of the excitation levels of 
the source and destination neurons. 
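
A minimal sketch of this Hebbian update is given below. Each
weight grows by the product of the excitation levels of the
neurons it connects; the learning rate `rate` and the example
activation values are assumptions introduced only for
illustration.

```python
def hebbian_update(weights, source_out, dest_out, rate):
    """Return new weights: w[i][j] + rate * OUT_source[i] * OUT_dest[j]."""
    return [[weights[i][j] + rate * source_out[i] * dest_out[j]
             for j in range(len(dest_out))]
            for i in range(len(source_out))]


w = [[0.0, 0.0],
     [0.0, 0.0]]

# Source neuron 0 is active, source neuron 1 is silent, and both
# destination neurons are active.
w = hebbian_update(w, source_out=[1.0, 0.0], dest_out=[1.0, 1.0], rate=0.5)
```

Only the weights leaving the active source neuron change; the
path from the silent neuron is left as it was.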

However, more effective learning algorithms, including
those for supervised learning, have been developed.

The Delta rule 

This is an important generalisation of the perceptron
training algorithm which extends this technique to continuous
inputs and outputs. The perceptron training algorithm may be
generalised by introducing a term $\delta$, which is the
difference between the target output T and the actual output A,

$\delta = T - A$ (3.8)

For the correction $\Delta_i$ associated with the i-th input
$x_i$,

$\Delta_i = \eta \, \delta \, x_i$

$W_i(n+1) = W_i(n) + \Delta_i$ (3.9)

where $\eta$ = learning rate coefficient,

$W_i(n+1)$ = the value of weight i after adjustment,

$W_i(n)$ = the value of weight i before adjustment.

The delta rule modifies weights appropriately for target 
and actual outputs of either polarity and for both continuous 
and binary inputs and outputs. 
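
The rule can be sketched in a few lines of Python. The example
trains a single thresholded neuron on the logical OR function.
The constant first input (acting as a bias), the learning rate
of 1.0 and the choice of OR as the task are assumptions made
for illustration and are not taken from the present study.

```python
def output(weights, x):
    """Hard-threshold neuron: 1 if the weighted sum is positive, else 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) > 0 else 0


def delta_rule_step(weights, x, target, rate):
    """Apply W_i(n+1) = W_i(n) + rate * (T - A) * x_i to every weight."""
    delta = target - output(weights, x)          # delta = T - A
    return [w + rate * delta * xi for w, xi in zip(weights, x)]


# Logical OR. The leading 1 in each input vector is a constant bias
# input, which is an assumption of this sketch.
training_set = [([1, 0, 0], 0), ([1, 0, 1], 1),
                ([1, 1, 0], 1), ([1, 1, 1], 1)]

weights = [0.0, 0.0, 0.0]
for epoch in range(10):
    for x, target in training_set:
        weights = delta_rule_step(weights, x, target, rate=1.0)
```

When the actual output already equals the target, the term
(T - A) is zero and the weights are left unchanged.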

3.3 BACKPROPAGATION ALGORITHM 

For supervised training of multilayer artificial neural
networks, the algorithm which proved to be a boon is the
backpropagation algorithm, which shall be discussed in detail
here.

It is a systematic method for training multilayer
artificial neural networks. Despite its limitations it has
dramatically expanded the range of problems to which
artificial neural networks can be applied.

As discussed earlier, the neuron shown in Fig. 3.4a is
used as the fundamental building block for the backpropagation
algorithm.

[Figure: Artificial Neuron with Activation Function]

$NET = O_1 W_1 + O_2 W_2 + \dots + O_n W_n = \sum_{i=1}^{n} O_i W_i$ (3.10)

$OUT = F(NET)$ (3.11)

The activation function usually used for backpropagation is

$OUT = F(NET) = 1 / (1 + e^{-NET})$ (3.12)

$F'(NET) = d\,OUT / d\,NET = OUT \, (1 - OUT)$ (3.13)

The sigmoid, often called the logistic or squashing function,
compresses the range of NET so that OUT lies between 0 and 1.
The squashing function introduces the nonlinearity as a result
of which multilayer networks have greater representational
power. Some other functions may also be used for
backpropagation provided they are differentiable everywhere.
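
The derivative in Eq. (3.13) can be verified numerically. The
sketch below compares OUT (1 - OUT) with a central finite
difference of the squashing function; the function names and
the step size h are assumptions of this illustration.

```python
import math


def squash(net):
    """Logistic squashing function, Eq. (3.12)."""
    return 1.0 / (1.0 + math.exp(-net))


def squash_derivative(net):
    """Derivative written in terms of OUT, Eq. (3.13)."""
    out = squash(net)
    return out * (1.0 - out)


def numerical_derivative(net, h=1e-6):
    """Central finite-difference estimate of d OUT / d NET."""
    return (squash(net + h) - squash(net - h)) / (2 * h)


differences = [abs(squash_derivative(net) - numerical_derivative(net))
               for net in (-2.0, 0.0, 3.0)]
```

The derivative is largest (0.25) at NET = 0 and becomes very
small for large positive or negative NET.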

Back propagation can be applied to networks with any 
number of layers, however it can easily be understood by 
demonstration on a network with two layers of weights (Fig. 
3.4b). The first layer of neurons serves only as a 
distribution layer. Each neuron in subsequent layers produces 
NET and OUT signals. A neuron is associated with a set of 
weights that connects it with the input. 

An Overview of Training 

The objective of training is to adjust the weights so
that application of a set of inputs produces the desired set
of outputs. These input and output sets can be referred to as
vectors. Each input vector is paired with a target vector
representing the desired output and together these are called
a training pair. Usually a network is trained over a number
of training pairs and the group of training pairs is called a
training set.

Before starting the training process all the weights
must be initialized to small random numbers, which ensures
that the network is not saturated by large values of weights.

The following steps are followed for training the
backpropagation network (a flow chart for backpropagation has
been shown in Fig. 3.4.1) :

1. The next training pair is selected from the training set 
and the input vector is applied to the network. 

2. The output of the network is calculated. 

3. The error between the network output and the desired
output (the target vector from the training pair) is
calculated.

4. The weights of the network are adjusted in a way that 
minimizes the error. 

5. The steps from 1 through 4 are repeated for each vector 
in the training set until the error for the entire set is 
acceptably low. 

The operations required in steps 1 & 2 above are
similar to the way in which the trained network will
ultimately be used i.e., an input vector is applied and the
resulting output is calculated.

Forward pass

This consists of Steps 1 & 2 in which the signal
propagates from input to output.

Reverse pass 

This can be studied under two heads,

(a) Adjustment of the weights of the output layer
(Fig. 3.4c) :

Adjusting the associated weights in this layer is
easy because a target value is available for each neuron in
this layer, thereby making possible the use of a slightly
modified Delta rule.

If, for example, a single weight from a neuron p in the
hidden layer j to a neuron q in the output layer k is to be
trained, then the output of the neuron in layer k is
subtracted from its target value to produce an error signal,
which is then multiplied by the derivative of the squashing
function calculated for that neuron of the layer k, thereby
producing the $\delta$ value,

$\delta_{q,k} = OUT_{q,k} \, (1 - OUT_{q,k}) \, (Target - OUT_{q,k})$ (3.14)

The modification or change in weight is given by

$\Delta W_{pq,k} = \eta \, \delta_{q,k} \, OUT_{p,j}$ (3.15)

therefore,

$W_{pq,k}(n+1) = W_{pq,k}(n) + \Delta W_{pq,k}$ (3.16)

where,

$W_{pq,k}(n)$ = the value of the weight from neuron p in
the hidden layer to neuron q in the output layer at step n
(before adjustment)

$W_{pq,k}(n+1)$ = the value of the weight at step (n+1)
(after adjustment)

$\delta_{q,k}$ = the value of $\delta$ for neuron q in the
output layer k

$OUT_{p,j}$ = the value of OUT for neuron p in the
hidden layer j

It should be noted that subscripts p and q refer to 
specific neurons while j and k refer to layers. 
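
A worked instance of the output layer adjustment may help. The
sketch below forms the error signal (Target - OUT), multiplies
it by the squashing function derivative OUT (1 - OUT) of
Eq. (3.13), and applies Eqs. (3.15) and (3.16) to one weight.
The numerical values and the function names are assumptions
chosen only to keep the arithmetic easy to follow.

```python
def output_delta(out_q, target_q):
    """Error signal times squashing derivative: OUT (1 - OUT) (Target - OUT)."""
    return out_q * (1.0 - out_q) * (target_q - out_q)


def updated_weight(w_pq, rate, delta_q, out_p):
    """W(n+1) = W(n) + rate * delta_q * OUT_p, as in Eqs. (3.15)-(3.16)."""
    return w_pq + rate * delta_q * out_p


# Output neuron q currently gives 0.5 but its target is 1.0.
delta = output_delta(out_q=0.5, target_q=1.0)    # 0.5 * 0.5 * 0.5 = 0.125

# The hidden neuron p feeding this weight has OUT = 0.5.
new_w = updated_weight(w_pq=0.25, rate=0.5, delta_q=delta, out_p=0.5)
```

Because the target is above the actual output, the correction is
positive and the weight increases from 0.25 to 0.28125.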

(b) Adjustments of weights of Hidden layers (Fig. 3.4d) : 

Since the hidden layers have no target vectors, the
training process mentioned above can’t be used.
Backpropagation provides a workable algorithm for this, which
trains the hidden layers by propagating the output error
back through the network layer by layer, adjusting the weights
at each layer. Equations (3.15) and (3.16) are also
applicable here, but for the hidden layers $\delta$ must be
generated without a target vector. Thus,

$\delta_{p,j} = OUT_{p,j} \, (1 - OUT_{p,j}) \left( \sum_q \delta_{q,k} W_{pq,k} \right)$ (3.17)

The above equation implies that, first, $\delta$ is calculated
for each neuron of the output layer and used to adjust the
weights feeding into the output layer. It is then propagated
back through the same weights to generate a value of $\delta$
for each neuron in the hidden layer, which, in turn, is used
to adjust the weights of the hidden layer.
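
The reverse pass can be sketched and checked in Python. The
code below computes the output and hidden layer values of
delta for a small network with two layers of weights, and
compares the resulting hidden layer correction with a finite
difference of the squared error. The network size, the weight
values, the absence of bias terms and the error measure
E = 0.5 * sum (Target - OUT)^2 are assumptions of this sketch.

```python
import math


def squash(net):
    return 1.0 / (1.0 + math.exp(-net))


def forward(x, w_hidden, w_out):
    """Forward pass: return (hidden OUT values, output OUT values)."""
    hidden = [squash(sum(x[i] * w_hidden[i][p] for i in range(len(x))))
              for p in range(len(w_hidden[0]))]
    outputs = [squash(sum(hidden[p] * w_out[p][q] for p in range(len(hidden))))
               for q in range(len(w_out[0]))]
    return hidden, outputs


def deltas(x, target, w_hidden, w_out):
    """Reverse pass: delta for each output neuron, then each hidden neuron."""
    hidden, outputs = forward(x, w_hidden, w_out)
    delta_out = [o * (1 - o) * (t - o) for o, t in zip(outputs, target)]
    delta_hidden = [hidden[p] * (1 - hidden[p])
                    * sum(delta_out[q] * w_out[p][q]
                          for q in range(len(delta_out)))
                    for p in range(len(hidden))]
    return delta_out, delta_hidden


def error(x, target, w_hidden, w_out):
    _, outputs = forward(x, w_hidden, w_out)
    return 0.5 * sum((t - o) ** 2 for o, t in zip(outputs, target))


x, target = [1.0, 0.5], [1.0]
w_hidden = [[0.1, -0.2], [0.4, 0.3]]   # input i -> hidden neuron p
w_out = [[0.5], [-0.6]]                # hidden neuron p -> output neuron q

delta_out, delta_hidden = deltas(x, target, w_hidden, w_out)
analytic = delta_hidden[1] * x[0]      # correction term for w_hidden[0][1]

# Finite-difference estimate of the negative error gradient for that weight.
h = 1e-6
plus = [row[:] for row in w_hidden]
plus[0][1] += h
minus = [row[:] for row in w_hidden]
minus[0][1] -= h
numeric = -(error(x, target, plus, w_out)
            - error(x, target, minus, w_out)) / (2 * h)
```

The two values agree closely, which indicates that the
backpropagated delta moves each hidden weight in the direction
that reduces the output error.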

Caveats 

Despite many successful applications of the 
Backpropagation algorithm there are certain drawbacks : 

* A long and uncertain training process, which may be due to
a non optimum step size.

* Outright training failures, which may arise from two
sources viz. network paralysis (training coming to a
standstill due to large weights in a region of small
derivative of the squashing function) and local minima.

3.4 NETWORK BIAS 

All but the simplest of recognition systems are
unlikely to be perfect. Taking the visual system as an
example, perfect recognition implies that an animal
unerringly reacts to all images of the correct object (or
class of objects) and never reacts to any inappropriate
image. But a recognition mechanism can only be expected to
react appropriately to those images it has been selected to
identify. One cannot predict with certainty how an animal
will react to new images it experiences. Many will have no
effect, but because there is an almost infinite number of
possible images that the retina may experience, it is
expected that some of these will elicit a greater response
than the particular signals to which the system has been
selected to respond. This has a clear analogy to Darwin’s
Principle of Natural Selection. The question of why such
preferences evolve remains a controversial issue. The
mechanisms concerned with signal recognition possess
inevitable biases in response that act as important agents of
selection on signal form (Magnus Enquist and Anthony Arak,
1993).

CHAPTER 4

IMAGE PROCESSING 


4.1 3D OBJECT RECOGNITION USING ANNs 

The problem of recognition, mainly speech and shape 
recognition, is, nowadays, one of the most challenging areas 
of research. The main goal of computer vision research is to 
give computers human like visual capabilities so that the 
machine may : 

* sense the environment in the field of view 

* understand what is being sensed 

* take appropriate actions as programmed

Desired characteristics of an ideal vision system

(a) It should be possible for the system to analyse scenes 
quickly and correctly. 

(b) The system must be capable of handling sensor data from 
arbitrary viewing direction i.e., it requires a view 
independent modeling technique. 

(c) The system must handle arbitrarily complicated real world
objects without giving preference to either curved or planar
surfaces.

(d) It should be able to modify the world model data in order
to handle new objects and new situations.

(e) The system must be capable of handling a certain amount 
of noise in the sensor data without a significant degradation 
in the system performance. 

The above characteristics can be acquired by an
Artificial Neural Network combined with a good Vision System.

PROBLEM FORMULATION:

If an object is given which we have never seen before,
we usually start to gather information about the object from
different view points; gathering such information and storing
it is called "model formation". Once familiar with
many objects we can identify them from an arbitrary viewpoint
without further investigation. Thus the central issue of
vision is the Identification and Location of objects in the
environment, which involves the critical step of linking
incoming visual information to stored object descriptions.

A good vision system should be autonomous and should work
from a single arbitrary view. The autonomous single arbitrary
view 3D object recognition problem is to locate and identify
3D objects in the environment autonomously using a single and
arbitrary view.

OBJECT LOCATION: This requires the determination of the
translation parameters with respect to a known co-ordinate
system as well as some orientation angles.

IDENTIFICATION: This requires obtaining the shape of the
object and matching it with the shapes stored in the database.

A vision system that can solve the stated 3D
object recognition problem successfully can be extremely
useful in a wide variety of applications including autonomous
vehicle navigation and automatic inspection and assembly.

RECOGNITION SYSTEM COMPONENTS (FIG. 4.1a) 

How can one recognise something unless one knows what
one is looking for? Perception is possible only if we
have a model of the real world, which therefore forms an
essential component of an object recognition system.

The World Model Module : 

Perception is possible only if we have a model of the
world. The popular models have been classified into two: (a)
high level or specialised models, which have been developed
mainly by work in the area of CAD, viz. models of manmade
components etc.; and (b) low level models of image formation,
also called point wise models, which have mainly been
developed by work in the field of photometry, where well
worked out and easily adaptable models of the image formation
process are provided.

The problems faced by both the types, as in the case of a
general purpose robot, can be solved by using a "parts and
process model" which is intermediate between the two.

A good vision system should be capable of learning new
types of objects. It should function even if the number of
objects to be recognised is large, if the objects are
occluded or largely obscured, or even when there are many
unknown objects present in the scene. Moreover it
should not employ specific object models.

Note: To build the world models, two theories in common use 
are that of superquadrics and fractals which are not being 
discussed here in detail for the sake of brevity. 

The Sensor Data: 

The input image in a computer vision system usually 
consists of numerical valued pixel arrays, where the pixel 
values are gray levels. In the recent past, digitised range
data have become available and their quality is improving day
by day. Range data are often available in the form of an array
of numbers referred to as a range image (or depth
map). Here the numbers quantify the distance from the sensor
focal plane to object surfaces within the field of view along
rays emanating from points on a regularly spaced grid.

The main advantage of using Range imagery (over
Intensity imagery) is the simple separation of figure from
background, which relatively simplifies the bottom-up
learning of object descriptions. However, finding the correct
part structure remains as difficult as in the case of
intensity imagery.

The Symbolic Description: 

The sensor data are processed until they reach the form
of a symbolic scene description. The model data can also be
transformed into a symbolic scene description. A matching
procedure can then be carried out on the quantities in this
intermediate (symbolic scene description) domain, which are
referred to as features.

The interaction and mapping between different components of
a recognition system have been shown in Fig. 4.1a.

Mapping (I): It creates intensity or range data. 

Description process (D): acts on the sensor data and extracts
relevant application independent features.

Modelling process (M): provides object models for real world
objects.

Understanding or recognition process (U): involves an
algorithm to perform matching between model and data
descriptions.

Rendering process (R): produces synthetic sensor data from
object models. Rendering provides an important link because
it allows an autonomous system to check on its own
understanding of the sensor data by comparing synthetic
images to the sensed images.

THE PROPOSED OBJECT RECOGNITION SYSTEM (FIG. 4.1b)

The essential components of this are: 

(i) The laser range finder (the use of which has already been
described).

(ii) The Recognition Block: Image pixels by themselves can
determine nothing. It is necessary to have a model of image
formation in order to obtain any assertion about the viewed
scene. Hence the need for a model can’t be sidestepped. The
models used are based on the theory of fractals and
superquadrics.

The ability of ANNs to learn from experience, to
generalize on their knowledge, and to perform abstraction
makes them suitable to be used in conjunction with a Vision
system.

In fact, vision is the most remarkable of all of
our intelligent sensing capabilities, through which we are
able to acquire information about our environment without
direct contact. To the surprise of most of us, a TV Camera
has a resolution on the order of 500 parts per sq. cm., while
the human eye has a limiting resolution on the order of some
25 x 10^6 parts per sq. cm.; thus humans have a
resolution 10,000 times finer than that of a TV Camera.

It was, therefore, necessary to examine the processes and the
problems involved in building computer vision systems which
share some similarities with the human vision system. The
ultimate objective is to determine a high level description
of a 3D scene with a competency level comparable to that of a
human vision system.

It is very necessary to distinguish between a scene and 
an image of a scene. A scene is the set of physical objects 
in a picture area whereas an image is the projection of the 
scene on to a 2D Plane. Thus, a typical computer vision 
system should be able to perform the following operations: 

1. Image formation, Sensing and Digitization. 

2. Local processing and Image segmentation. 

3. Shape formation and Interpretation. 

4. Semantic analysis and description. 

4.2 OVERVIEW OF VISION PROCESSING 

The input to a vision system is a 2D image collected on
some form of light sensitive surface. This surface is scanned
by some means to produce a continuous voltage output that is
proportional to the light intensity of the image on the
surface. The output voltage f(x,y) is sampled at a discrete
number of x and y points or pixel (picture element) positions
and converted to numbers. The numbers correspond to the gray
level intensity for black and white images. For colour images
the intensity value comprises three separate arrays of
numbers, one for the intensity value of each of the basic
colours (Red, Green, and Blue).

Thus, through the digitization process, the image is
transformed from a continuous light source into an array of
numbers which correspond to the local image intensities at
the corresponding x-y pixel positions on the light sensitive
surface.
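
The sampling and conversion step can be sketched as follows.
A continuous intensity function f(x, y), assumed here to take
values between 0.0 and 1.0, is sampled at the centre of each
pixel and quantized to an integer gray level. The function
name, the 256 levels and the test scene are assumptions made
for illustration.

```python
def digitize(f, width, height, levels=256):
    """Sample f(x, y) on a width-by-height grid and quantize to integers.

    f is assumed to return an intensity between 0.0 and 1.0 for
    coordinates x and y that each run from 0.0 to 1.0.
    """
    image = []
    for row in range(height):
        y = (row + 0.5) / height                 # centre of the pixel row
        image.append([min(levels - 1,
                          int(f((col + 0.5) / width, y) * levels))
                      for col in range(width)])
    return image


# Hypothetical scene: brightness rises linearly from left to right.
ramp = digitize(lambda x, y: x, width=4, height=2)
```

For a colour image the same routine would be applied three
times, once for each of the red, green and blue intensities.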

Using the array of numbers, certain low level operations
are performed, such as smoothing of neighbouring points to
reduce noise, finding outlines of objects or edge elements,
thresholding (recording maximum and minimum values only,
depending on some fixed intensity threshold level) and
determining texture, colour and other object features. These
initial processing steps are the ones which are used to
locate and accentuate object boundaries and other structures
within the image.

The next stage of processing, the intermediate level,
involves connecting, filling in and combining boundaries, 
determining regions and assigning descriptive labels to 
objects that have been accentuated in the first stage. This 
stage builds higher level structures from the lower level 
elements of the first stage. After completion, it passes on 
labelled surfaces such as geometrical objects that may be 
capable of identification. 

High level image processing consists of identifying the
important objects in the image and their relationships for
subsequent descriptions as well as defined knowledge
structures and hence for use by a reasoning component.

Special types of vision systems may also require 3D
processing and analysis as well as motion detection and
analysis.

4.3 OBJECTIVES OF COMPUTER VISION SYSTEMS 

The ultimate goal of computer image understanding is to 
build systems that equal or exceed the capabilities of human 
vision systems. In an ideal case computer vision systems 
would be capable of interpreting and describing any complex 
scene in complete detail. But the amount of processing and 
the storage required to interpret and describe a complex 
scene can be enormous. For example, a single image from a high
resolution aerial photograph may result in some four to nine
million pixels (bytes) of information and require on
average some ten to twenty computations per pixel.
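
The scale implied by these figures can be worked out directly.
The short calculation below multiplies the quoted pixel counts
by the quoted computations per pixel; the function name is an
assumption of this sketch.

```python
def total_computations(pixels, computations_per_pixel):
    """Total low level computations needed for one image."""
    return pixels * computations_per_pixel


low = total_computations(4_000_000, 10)    # smallest case quoted above
high = total_computations(9_000_000, 20)   # largest case quoted above
```

A single aerial image therefore calls for roughly 40 million to
180 million computations before any interpretation begins.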

4.4 IMAGE TRANSFORMATION AND LOW LEVEL PROCESSING 

This includes the process of forming an image and
transforming it to an array of numbers which can then be
operated on by a computer. In this first stage only local
processing is performed on the numbers to reduce noise and
other unwanted picture elements and to accentuate object
boundaries.

TRANSFORMING LIGHT ENERGY TO NUMBERS :

The first stage in image processing requires a
transformation of light energy to numbers, the language of
computers. To accomplish this some form of light sensitive
device (transducer) is used, such as a Vidicon Tube or a
Charge Coupled Device (CCD).

A Vidicon Tube is the type of sensor typically found in 
home or industrial Video systems. A lens is used to project 
the image on to a flat surface of the vidicon. The Tube 
surface is coated with a photoconductive material whose 
resistance is inversely proportional to the light intensity 
falling on it. An Electron Gun is used to produce a flying 
spot scanner with which to rapidly scan the surface left to 
right and top to bottom. The scan results in a time varying 
voltage which is proportional to the scan spot image 
intensity. The continuously varying output voltage is then 
fed to an analog to digital converter (ADC) where the voltage 
amplitude is periodically sampled and converted to numbers. A
typical ADC unit will produce thirty complete digitized
frames per second, each consisting of 256x256 or 512x512 (or
more) samples of an image. Each sample is a number (or triple
of numbers in the case of colour systems) ranging from 0 to 63
(6 bits) or 0 to 255 (8 bits). The image conversion process
has been shown in Figure 4.2b.

A CCD is typical of the class of solid state sensor
devices known as charge transfer devices that are now being
used in many vision systems. A CCD is a rectangular chip
consisting of an array of capacitive photo detectors, each
capable of storing an electrostatic charge. The charges are 
scanned like a clock driven shift register and converted into 
a time varying voltage which is proportional to the incident 
light intensity on the detectors. This voltage is sampled and 
converted to integers using an ADC unit as in the case of the 
vidicon tube. The density of the detectors on the chip is 
quite high. For example, a CCD chip of about 5 sq. cm. in 
area may contain as many as 1000x1000 detectors. 

The numeric outputs from the ADC unit are collected as an
array of numbers which corresponds to the light intensity of
the image on the surface of the transducer. This is the input
to the next stage of processing.
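
The sampling figures quoted for the ADC unit can be checked
with a short calculation. An n-bit sample can take 2^n
distinct values, so its range runs from 0 to 2^n - 1, and the
number of samples per second is the frame size multiplied by
the frame rate. The function names below are assumptions of
this sketch.

```python
def gray_level_range(bits):
    """Smallest and largest integer an n-bit sample can hold."""
    return 0, 2 ** bits - 1


def samples_per_second(width, height, frames_per_second=30):
    """Number of samples the ADC must deliver each second."""
    return width * height * frames_per_second


range_6_bit = gray_level_range(6)
range_8_bit = gray_level_range(8)
rate_256 = samples_per_second(256, 256)
rate_512 = samples_per_second(512, 512)
```

At thirty frames per second a 512x512 image requires close to
eight million conversions per second, four times the rate for a
256x256 image.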

PROCESSING THE QUANTIZED ARRAYS: 

The array of numbers produced from the image sensing 
device may be thought of as the lowest, most primitive level 
of abstractions in the vision understanding process. The next 
step in the processing hierarchy is to find some structures 
among the pixels such as pixel clusters which define object 
boundaries or regions within the image. Thus it is necessary
to transform the array of raw pixel data into regions of
discontinuity and homogeneity, and to find edges and other
delimiters of these object regions. A raw digitized image
will contain some noise and distortion, hence computations to
reduce these may be necessary before locating edges and
regions. Other low level operations include thresholding to
help define homogeneous regions, and different forms of edge
detection to define boundaries.

THRESHOLDING: 

This is the process of transforming the gray level
representation to a binary representation of the image. All
digitized array values above some threshold value T are set
equal to the maximum gray level value (black) and values less
than or equal to T are set equal to zero (white). Thus
thresholding is one way to segment the image into sharpened
object regions by enhancing some portions and reducing others
like noise and other unwanted features. Thresholding at
different levels may be necessary to handle extreme
intensities. The best choice of threshold value can be
decided by a study of the histogram of light intensity levels
(Fig. 4.2c).
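
A minimal sketch of thresholding, together with the histogram
used to choose T, is given below. The sample image, the
threshold of 128 and the maximum gray level of 255 are
assumptions chosen for illustration.

```python
def threshold_image(image, t, max_level=255):
    """Set values above t to max_level and all other values to zero."""
    return [[max_level if value > t else 0 for value in row]
            for row in image]


def histogram(image, levels=256):
    """Count how many pixels take each gray level."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


gray = [[12, 200, 90],
        [130, 128, 255]]

binary = threshold_image(gray, t=128)
counts = histogram(gray)
```

A pixel equal to T falls on the zero side, matching the rule
stated above. In practice T is often placed in a valley of the
histogram between two peaks.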

SMOOTHING: (Fig. 4.3a) 

It is a form of digital filtering. It is used to reduce
noise and other unwanted features and to enhance certain
image features. Many techniques are involved, some of which
are local averaging, the use of models and parametric form
fitting.
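
Local averaging, the simplest of these techniques, can be
sketched as follows. Each interior pixel is replaced by the
mean of its 3x3 neighbourhood. The mask size and the decision
to leave border pixels unchanged are assumptions of this
sketch.

```python
def smooth(image):
    """Replace each interior pixel by the mean of its 3x3 neighbourhood.

    Border pixels are copied unchanged because they lack a full
    neighbourhood.
    """
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            total = sum(image[r + dr][c + dc]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1))
            result[r][c] = total / 9
    return result


# A single noisy pixel in an otherwise uniform region.
noisy = [[10, 10, 10],
         [10, 100, 10],
         [10, 10, 10]]

smoothed = smooth(noisy)
```

The isolated value of 100 is reduced to 20, much closer to its
surroundings, at the cost of some blurring of genuine detail.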

LOCAL EDGE DETECTION : 

It is a process of finding a boundary or delimiter 
between two regions. An edge will show up as a relatively 
thin line or arc which appears as a measurable difference in 
contrast between otherwise homogeneous regions. Several 
approaches have been proposed for locating the boundaries: 

* Boundaries separating adjoining regions represent a 
discontinuity in one or more of features, such as colour, 
texture, three dimensional flow effects or intensity, a fact 
which can be exploited by measuring the rate of change of a 
feature value over the image surface. 

* There is some evidence to support the belief that the
human eye uses a form of Gaussian transformation called
lateral inhibition, which enhances the contrast between
gradually changing regions such as an object and its background.

* Another approach used to filter the digitized image
applies frequency domain transforms such as the Fourier
Transform. Since edges represent higher frequency
components, the transformed image can be analysed on the basis
of its frequency content. An efficient algorithm called the
Fast Fourier Transform has been developed which, when applied
to an array of intensity values, produces an array of complex
numbers that correspond to the spatial frequency components
of the image.

* Another method is that of model fitting: a model in
the form of a mask is shifted over a region and compared to
the corresponding gray levels.

[Figure 4.3(a): Application of a smoothing mask (original
image array and smoothed image array)]

[Figure 4.3(b): Examples of textured surfaces]
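
The first approach listed above, measuring the rate of change
of a feature value over the image, can be sketched with simple
differences between neighbouring pixels. Only the horizontal
direction is examined here, and the minimum change of 50 gray
levels is an assumed value.

```python
def horizontal_gradient(image):
    """Absolute difference between each pixel and its right-hand neighbour."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
            for row in image]


def edge_columns(image, min_change):
    """For each row, list the columns where the change is at least min_change."""
    return [[c for c, g in enumerate(row) if g >= min_change]
            for row in horizontal_gradient(image)]


# A dark region on the left and a bright region on the right, with
# slight noise in the second row.
step = [[10, 10, 10, 200, 200],
        [10, 12, 11, 198, 200]]

edges = edge_columns(step, min_change=50)
```

The small noise differences stay below the limit, so only the
boundary between columns 2 and 3 is reported in each row.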

Texture and color are also used to identify boundaries.
Texture is a repeated pattern of elementary shapes occurring
on an object’s surface. The structure in texture is generally
too fine to be resolved, yet still coarse enough to cause
noticeable variations in the gray levels (Fig. 4.3b). The
methods of analysis which have been developed are based
on (a) the application of pattern matching; (b) the use of the
Fourier Transform; and (c) modeling with special functions
known as fractals.

Color, when used to identify regions, requires more than
three times as much processing as gray level processing
because the image must, first of all, be separated into three
primary colors (Fig. 4.3c). In complex scene analysis color
may be the most effective method of segmentation and object
identification.

4.5 INTERMEDIATE LEVEL IMAGE PROCESSING 

This concentrates on segmenting the image surface into
larger global structures using homogeneous features in pixel
regions and boundaries formed from pieces of edges discovered
during the low level processing.

This level requires that pieces of edges be combined
into contiguous contours which form the outline of objects,
partitioning the image into coherent regions, developing
models of segmented objects and then assigning labels which
characterize the object regions. The general process of
forming contours is called segmentation, the methods employed
for which are: (i) Graphical edge finding; (ii) Edge finding
with Dynamic programming; and (iii) Region segmentation
through splitting and merging.

Once the image has been segmented into disjoint
regions, their shapes, spatial interrelationships, and other
characteristics can be described and labeled for subsequent
interpretation, which may be a 2D or 3D scene description.

Template Matching is the process of comparing patterns found
in an image with prestored templates that are already named.
The matching process may occur at lower levels, using
individual pixels or groups of pixels with correlation
techniques, or at higher image processing levels using
labeled region structures.

Template matching can be effective only when the search 
process is constrained in some way, for example, the types of 
scenes and permissible objects should be known in advance, 
thereby limiting the possible pattern-template pairs. 

4.6 HIGH LEVEL PROCESSING 

High level processing techniques are less mechanical 
than either of the former two. Here the intermediate level 
region descriptions are transformed into high level scene 
descriptions in one of the knowledge representation formalisms 
like associative nets, frames, FOPL statements etc. 

The end objective of this stage is to create high level 
knowledge structures which can be used by an inference 
programme. It is obvious that the resulting structure should 
uniquely and accurately describe the important objects in an 
image, including their interrelationships. In fact, the 
particular vision application will dictate the appropriate 
level of detail and what is considered to be important in a 
scene description. 

Before a scene can be described in terms of high level 
structures, prestored model descriptions of the objects must 
be available which are compared with the region description 
created during the intermediate level stage. 

Associative networks have become a popular method of 
scene descriptions since they show the relationship among the 
objects as well as object characteristics. 

CHAPTER 5 

Present Study 


5.1 DATA AVAILABLE 

The form in which the data is and can be available to us does 
affect the planning for future developmental activity. A very 
common method in current practice for keeping a record of the 
field traffic data, i.e., of the incidents occurring on a road 
stretch, is the preparation of video cassettes by operating a 
video camera placed in a test vehicle so as to have an 
overall view of traffic interaction. A large number of such 
cassettes have been prepared by CRRI, New Delhi on various 
National and State Highways (during January 1992 to April 
1993). A special Video Instrumentation System developed by 
the Australian Road Research Board is mounted on a test vehicle, 
which is a Maruti van in this case. 

The essential components of the instrumentation system, 
without going into much detail, are: 

(i) Video cameras: one at the front to view the oncoming 
vehicles and one at the rear to view the traffic behind. 

(ii) Video recorder 

(iii) Video monitor 

(iv) Video splitter (multiviewer), which enables the pictures 
taken by more than one video camera to be recorded in a 
single picture frame; i.e., for viewing the pictures taken by 
the two cameras simultaneously, it divides the monitor frame 
into two parts horizontally (or vertically). It can even 
divide the monitor screen into four parts. 

(v) Video timer 

These may be supplemented with: 

(vi) Rotopulser, which is used for vehicle interaction studies. 
In this, pulses are sent, counted, and passed to the video 
clock for measuring distance along the road when the 
instrumented vehicle is run. This replaces the inaccurate 
odometer and gives the distance coordinates in metres (least 
count is 1 m). 

(vii) Speedo odometer 

(viii) Radar speedmeter, which is used to detect the speed of a 
moving vehicle. 

The first line of the digitized data on screen 
indicates the date in year, month, and day followed by the time 
in hours and minutes. The last figure of the second line 
completes the time indication in tenths of a second. The other 
figures in the second line indicate the relative speed of the 
oncoming vehicle with reference to the test vehicle (in Kmph), 
the speed of the test vehicle (in Kmph) and the distance coordinate 
of the test vehicle (in m). Thus it is in this form that the field 
data is and can be available to us. 

5.2 MANUAL IMAGE PROCESSING 

In order to obtain various parameters and to establish 
relationships between different parameters related to traffic 
interaction, the video cassettes are presently being 
processed manually. The instrument setup for this is 
shown in Fig. 5.1(a). 

The signal from the VCR goes to the video scaler, from where 
it goes to the video splitter, which passes it on to the video 
monitor. The video scaler helps create a scale/grid on the 
screen to facilitate noting down the width of the 
vehicle, and the splitter helps divide the screen into two 
parts to watch the views from the two cameras simultaneously. 
To take a reading, the scene is paused as soon as a clear 
picture of an oncoming vehicle comes into the field of view of 
the front camera, and the image size is noted with the help of 
the video scaler. There are polynomials calibrated for 
directly getting the distance of the oncoming vehicle from 
the test vehicle knowing the image size (Fig. 5.1b). If at 
the first instance the image size gives the distance d1 at 
the time t1, and then another image size gives the distance 
d2 at time t2, we can have the relative speed of the vehicle as 

relative speed = (d1 - d2)/(t1 - t2) 
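
The two-reading procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the calibration coefficients, the image sizes and the time stamps are hypothetical (the actual polynomials of Fig. 5.1b are not reproduced in the text), and the function names are ours.

```python
def distance_from_image_size(image_size, coefficients):
    """Evaluate a calibration polynomial by Horner's rule.

    coefficients are listed from the highest power down to the constant.
    """
    distance = 0.0
    for c in coefficients:
        distance = distance * image_size + c
    return distance


def relative_speed(d1, t1, d2, t2):
    """Relative speed = (d1 - d2)/(t1 - t2), as written in the text.

    With t2 later than t1, the value is negative when the gap is closing.
    """
    if t1 == t2:
        raise ValueError("the two readings must be taken at different times")
    return (d1 - d2) / (t1 - t2)


# Hypothetical calibration: distance (m) = 0.5*s^2 - 20*s + 250,
# where s is the image size read off the video scaler grid.
COEFFS = [0.5, -20.0, 250.0]

d1 = distance_from_image_size(4.0, COEFFS)   # first paused frame
d2 = distance_from_image_size(6.0, COEFFS)   # second paused frame
speed_mps = relative_speed(d1, 10.0, d2, 11.5)  # times in seconds
speed_kmph = speed_mps * 3.6                 # convert m/s to Kmph
```

The times would in practice be read from the video timer display, which the text says resolves tenths of a second.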

The calculation of parameters such as velocity helps 
improve safety. Moreover, a study of these cassettes helps 
in classifying the various vehicle types running on 
particular National Highways, State Highways and other roads 
for prediction of future traffic. 

5.3 AUTOMATIZED IMAGE PROCESSING USING ANNs 

If such a huge amount of data is processed manually for the 
improved design of transportation facilities, an enormous amount of 
time will be required. Hence, our basic purpose is to 
automatize this process of image processing so as to help in 
the design of automatized intersection control, automatized 
accident warning systems, etc. 

To begin with this challenging task, the first step that 
has been planned is to use ANNs in the recognition of 
vehicles from their 3D scene (after converting the 3D scene 
into a 2D digitized image). The conversion of a 3D scene to a 
2D digitized one has to be through a computer vision system. 
Even if only recognition of the various vehicles that form 
the traffic stream is successfully achieved, much of the 
manual work gets reduced, and achievement of higher goals 
through this automatization (involving dynamics also) seems 
possible. 

It will not be out of place to mention that the 
digitization of the image of vehicles can also be done in the 
form of an 8x8 matrix or 10x10 matrix by using the video scaler, 
which forms grids on the screen, enabling us to adjust the 
vehicle image (front views at different angles have been 
chosen in our study) within 8x8 or 10x10 square blocks formed 
on the screen (Fig. 5.2a), and then assigning the value 0 to 
blocks not containing any part of the vehicle and 1 to 
blocks which contain any part. The instrument setup for this 
type of digitization has been shown in Fig. 5.2b. 
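
The block-assignment rule just described (0 for an empty block, 1 for a block touched by any part of the vehicle) can be illustrated with a short Python sketch. It assumes, purely for illustration, that the vehicle outline is already available as a binary image; in the study itself the grid was overlaid on the screen with the video scaler. The function names and the 16x16 test shape are hypothetical.

```python
def digitize(mask, grid=8):
    """Reduce a binary image (list of rows of 0/1) to a grid x grid matrix.

    A block is set to 1 if any pixel inside it belongs to the vehicle,
    otherwise it stays 0.
    """
    rows, cols = len(mask), len(mask[0])
    blocks = [[0] * grid for _ in range(grid)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c]:
                # Map the pixel position to the block that contains it.
                blocks[r * grid // rows][c * grid // cols] = 1
    return blocks


def flatten(matrix):
    """Turn the grid into a single input vector, row by row."""
    return [value for row in matrix for value in row]


# Hypothetical 16x16 silhouette: a filled rectangle standing in for a vehicle.
mask = [[1 if 4 <= r <= 11 and 2 <= c <= 13 else 0 for c in range(16)]
        for r in range(16)]

grid_8x8 = digitize(mask, grid=8)
input_vector = flatten(grid_8x8)   # 64 elements, one per input node
```

An 8x8 grid gives the 64-element input vector used for training; passing `grid=10` would give the 100-element alternative.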

5.4 SOFTWARE - ITS INPUT & OUTPUT 

A software has been developed using the Backpropagation 
algorithm for ANN training (Fig. 5.3 shows the model), in 
which the dimensions have been so specified that it can work for 
100 input, 4500 hidden layer and 3 output nodes. If the number 
of inputs is to be increased, the dimensions can be changed 
accordingly, but it has been found by trial and error that 
even this many nodes require a very large memory 
because, as a convention, 'n' input nodes require n(n-1) 
hidden layer nodes; hence the large number of weights can be 
imagined. Hence, the higher the number of input nodes, the 
larger the memory that must be allocated on the 
machine. It is because of this that a small 8x8 matrix (which 
means 64 input elements) has been preferred to train the 
network with. It has been found that a 512x512 matrix will 
require about 100 MB of memory for training the network, which 
imposes a serious limitation. For any particular vehicle, the 
digitized front views at different angles form the input 
vectors, and all of these have a common output vector, which 
may be 0, 1, 2, 3, etc. expressed in binary form. Thus, 
as many training pairs are available for each vehicle as there 
are digitized front views at different angles. The 
aforesaid practical difficulties could only be known after an 
extensive trial and error process during the training of the 
network. 
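
To make the memory argument concrete, the following Python sketch counts the connection weights of a fully connected network with one hidden layer. It is an illustration only: bias terms are ignored, the 4 bytes per weight is an assumed storage size, and the function names are ours, not those of the software described here.

```python
def hidden_nodes_by_convention(n_inputs):
    """Hidden layer size under the n(n-1) convention quoted in the text."""
    return n_inputs * (n_inputs - 1)


def weight_count(n_in, n_hidden, n_out):
    """Weights in a fully connected input-hidden-output network (no biases)."""
    return n_in * n_hidden + n_hidden * n_out


def weight_memory_bytes(n_in, n_hidden, n_out, bytes_per_weight=4):
    """Storage for the weights alone; 4 bytes per weight is an assumption."""
    return weight_count(n_in, n_hidden, n_out) * bytes_per_weight


hidden_for_10 = hidden_nodes_by_convention(10)      # 10 inputs
hidden_for_64 = hidden_nodes_by_convention(64)      # 8x8 matrix
weights_8x8 = weight_count(64, 1000, 3)             # 64-1000-3 network
weights_max = weight_count(100, 4500, 3)            # dimensions quoted above
bytes_512 = weight_memory_bytes(512 * 512, 100, 3)  # hypothetical 100 hidden nodes
```

Under these assumptions, a 512x512 input with even a modest, hypothetical 100 hidden nodes already needs on the order of 100 MB for the weights alone, which is the same order of magnitude as the figure reported above.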

The working of the different modules as shown in 
Fig. 5.3 is as follows: 

Readpatterns: This part of the program reads the training set 
consisting of several input-output training pairs and stores 
it so that there is no need to go through the patterns 
again and again. The number of nodes in the input, output, 
and hidden layers are specified in this phase and the patterns 
are read accordingly. 

FIG. 5.3 MODEL OF NEURAL NETWORK SOFTWARE 

Fresh start/Recover weights: Here the software has to be told 
whether we want to initialise the weights, i.e., 
start afresh, or recover the weights of the 
previous calculation. If we choose the first option, the 
weights are initialised by random number generation; otherwise 
the weights of the previous calculation are recovered. 

Training: This phase of the program uses the Backpropagation 
algorithm for modifying the weights again and again, 
depending on the error function, till the percentage error is 
within permissible limits, i.e., the network reaches a state 
in which it can give the correct output when exposed to an input 
pattern similar to the ones it has been trained with. 

Testing: In this phase a similar input is fed and the output 
obtained. It is then seen whether this output is right. If it 
is so, the proper working of the network is proven. This 
phase, however, requires an extra pattern. 
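
The modules described above can be mimicked by a minimal Python sketch of backpropagation with one hidden layer. It is not the software developed in this study: the tiny 4-3-1 network, the sigmoid activation, the learning rate, the number of epochs and the two training patterns are all assumptions chosen to keep the example short.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(n_in, n_hid, n_out, rng):
    """'Fresh start': random weights; the last entry of each row is a bias."""
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
    w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
    return w_hid, w_out


def forward(x, w_hid, w_out):
    """Return hidden activations and network outputs for one input pattern."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hid]
    y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in w_out]
    return h, y


def train(patterns, w_hid, w_out, rate=0.5, epochs=500):
    """'Training': repeat the backpropagation weight update over all patterns."""
    for _ in range(epochs):
        for x, target in patterns:
            h, y = forward(x, w_hid, w_out)
            # Error terms (deltas) for the output and hidden layers.
            d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
            d_hid = [hj * (1.0 - hj) * sum(d_out[k] * w_out[k][j]
                                           for k in range(len(w_out)))
                     for j, hj in enumerate(h)]
            for k, row in enumerate(w_out):
                for j, v in enumerate(h + [1.0]):
                    row[j] += rate * d_out[k] * v
            for j, row in enumerate(w_hid):
                for i, v in enumerate(x + [1.0]):
                    row[i] += rate * d_hid[j] * v


def total_error(patterns, w_hid, w_out):
    """Sum of squared differences between given and calculated outputs."""
    return sum((t - o) ** 2 for x, target in patterns
               for t, o in zip(target, forward(x, w_hid, w_out)[1]))


# 'Readpatterns': two hypothetical input-output training pairs.
patterns = [([1.0, 0.0, 1.0, 0.0], [0.9]),
            ([0.0, 1.0, 0.0, 1.0], [0.1])]

rng = random.Random(0)
w_hid, w_out = init_weights(4, 3, 1, rng)
error_before = total_error(patterns, w_hid, w_out)
train(patterns, w_hid, w_out)
error_after = total_error(patterns, w_hid, w_out)

# 'Testing': feed an extra pattern similar to the first training pattern.
_, test_output = forward([1.0, 0.0, 1.0, 1.0], w_hid, w_out)
```

The "Recover weights" option would correspond to loading `w_hid` and `w_out` from a previous run instead of calling `init_weights`; how the original software stored them is not described in the text.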

The neural network was trained with the digitized inputs, 
and the training procedure was quite cumbersome because even 
64 (8x8 matrix) input and 1000 hidden nodes mean a large 
number of weights which are to be modified iteratively for 
the prescribed number of times (say 500 or 1000), and this 
process is repeated again and again after changing the parameters 
of convergence each time till the error reaches an acceptably 
low value. The results of training by 4 training sets follow. 

Such results favour the use of ANNs for pattern recognition 
of traffic elements. 

[Printout of training results with columns "given o/p", "cal o/p" 
and "error". The "given o/p" and "cal o/p" columns are illegible 
in the source. The legible "error" column (in %) is reproduced 
below in printed order: six successive blocks of eight values, 
presumably one value per pattern and one block per step.] 

Entry    Block 1     Block 2     Block 3     Block 4     Block 5     Block 6 
  1     92.048836    1.241616   -3.624587    1.181824   -1.053336    0.078707 
  2     92.21*       7.083501    1.919631    4.960893    2.632666    3.229309 
  3     90.345894    4.213131    0.078013    2.100138    0.221091    0.576011 
  4     92.398361    4.791988   -0.754622    3.522657    1.115346    2.119941 
  5     89.378059   -0.583452   -3.233036    0.231360   -0.960340    0.040857 
  6     91.009430    1.630008   -1.821167    1.701564    0.055641    0.952439 
  7     92.197815    4.357757   -1.471637    2.374946   -0.092756    0.815742 
  8     91.651291    3.719034   -1.693719    1.392734   -0.742113    0.134483 

* remaining digits illegible in the source 

o/p = output 

RESULTS OF SECOND TRAINING SET 
10 INPUT NODES, 90 HIDDEN NODES, 1 OUTPUT NODE 
8 PATTERNS, [number illegible] ITERATIONS 

[Each entry below was printed on three consecutive rows of the 
original output, giving nine rows per step.] 

STEP NO   Rows   given o/p   cal o/p     error (%) 
   0      1-3    0.100000    0.086453    13.547249 
   0      4-6    0.100000    0.088361    11.639081 
   0      7-9    0.100000    0.086663    13.336725 
   1      1-3    0.100000    0.100228    -0.228412 
   1      4-6    0.100000    0.102472    -2.472214 
   1      7-9    0.100000    0.100395    -0.395075 
   2      1-3    0.100000    0.099781     0.219308 
   2      4-6    0.100000    0.101999    -1.998790 
   2      7-9    0.100000    0.099918     0.081681 
   3      1-3    0.100000    0.100226    -0.226296 
   3      4-6    0.100000    0.102430    -2.430417 
   3      7-9    0.100000    0.100315    -0.314660 
   4      1-3    0.100000    0.099866     0.134081 
   4      4-6    0.100000    0.102046    -2.046145 
   4      7-9    0.100000    0.099926     0.074200 
   5      1-3    0.100000    0.100224    -0.224268 
   5      4-6    0.100000    0.102388    -2.387807 
   5      7-9    0.100000    0.100234    -0.234477 

o/p = output 

RESULTS OF THIRD TRAINING SET 
64 INPUT NODES, 1000 HIDDEN NODES, 3 OUTPUT NODES 
3 PATTERNS, [number illegible] ITERATIONS 
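
The legible figures of the third training set are consistent with a percentage error computed as (given - calculated)/given x 100. The Python sketch below states that reading explicitly; the formula is inferred from the printed numbers rather than quoted from the software, and the tolerance values in the stopping check are hypothetical.

```python
def percent_error(given, calculated):
    """Percentage error of a calculated output relative to the given output."""
    if given == 0:
        raise ValueError("given output must be non-zero")
    return (given - calculated) / given * 100.0


def within_limits(rows, tolerance_percent):
    """True when every (given, calculated) pair is inside the tolerance."""
    return all(abs(percent_error(g, c)) <= tolerance_percent for g, c in rows)


# Given and calculated outputs taken from the step 0 and step 5 printout.
first_error = percent_error(0.1, 0.086453)
step_5 = [(0.1, 0.100224), (0.1, 0.102388), (0.1, 0.100234)]

loose_ok = within_limits(step_5, tolerance_percent=3.0)   # hypothetical limit
strict_ok = within_limits(step_5, tolerance_percent=1.0)  # hypothetical limit
```

A check of this kind is what "till the percentage error is within permissible limits" amounts to; the permissible limit actually used in the study is not stated.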

CHAPTER 6 

SUMMARY, CONCLUSIONS AND SCOPE FOR FUTURE WORK 

6.1 SUMMARY 

There are two main objectives of the present study: 

1. To have a deep insight into the field of ANNs (which are 
complementary to Artificial Intelligence), gaining sufficient 
knowledge about the principles involved therein and their 
working, and also to study in depth the phenomena of Image 
Processing and Analog to Digital Conversion. Not only this, 
the purpose is to examine how artificial neural networks 
can be used in association with a good Vision System to 
locate and identify 3D objects in the environment 
autonomously using a single and arbitrary view. Thus a 
knowledge of the present state of the art of 3D visual scene 
analysis using ANNs has to be gained so as to apply this, in 
principle, to the field of (image processing in) traffic 
engineering. 

2. To attempt shape recognition of different types of 
vehicles seen in a 3D visual scene by training an ANN for 
this purpose. 

A detailed study has been conducted to fulfill the first 
objective and the details have been noted down in chapters 3 

In order to fulfill the second objective, 2D picture 
frames of the front views of the vehicles have been digitized 
(converted into numbers, the language of computers) as per a 
definite scheme (for details refer to Chapter 5). The digitized 
inputs have to be kept in the form of low order matrices 
because for high order matrices (256x256 or 512x512) memory 
as high as 100 MB or even more is required, which posed a 
serious restriction and must have affected the results 
adversely. Even with small matrices of the order of 8x8 (i.e., 
64 input nodes), the number of hidden layer nodes could not be 
increased beyond a certain value due to the memory 
restriction. A software using Artificial Neural Network 
theory has been developed wherein the algorithm used for 
training the network is the Backpropagation algorithm. This 
network was trained with different input patterns (digitized 
inputs) and it has been found that the error in a good number 
of cases gets reduced to an acceptably low value, which is a 
very encouraging result. The results of training for 
different patterns and after varying the parameters affecting 
convergence have been shown in Chapter 5. However, some 
cases also give a high percentage of error, and convergence is 
not achieved even after a great many iterations, which may be 
due to: 

(i) low order of input matrices, which has been kept due to 
limitation of memory space. 

(ii) caveats of the BP algorithm like local minima etc. 

(iii) network bias against such input patterns 
(refer Article 3.4) 

6.2 CONCLUSIONS 

The successful training of the network by a number of 
classes of patterns, even with the memory restriction, is a 
healthy sign towards future use of ANNs for image processing, 
but it would be better to use special hardware for ANNs if 
better results are to be reached, because the large 
decrease in the order of input matrices owing to the memory 
restriction must have affected the results adversely. 

The network has a bias towards certain types of patterns 
as compared to other types, which supports and is analogous to 
Darwin's Theory of Natural Selection [refer Article 3.4 for 
details]. 

The time consumed for training is very large, and an 
attempt can be made to make use of Genetic Algorithms to 
reduce the same. 

It can be concluded that neural networks are potentially 
of considerable value in the field of traffic engineering, but 
major problems in their use are the acquisition of large 
enough data sets for effective training and the high memory 
requirement during training, besides the large CPU time needed 
for training. 

6.3 SCOPE FOR FUTURE WORK 

The present work may be taken to mark an effective 
beginning in the use of ANNs in association with Vision 
systems for 3D image processing in traffic engineering. There 
is a large scope for the following future work in this field of 
automatized image processing for visual scene analysis: 

1. To work out dynamics of vehicles. 

2. Designing of automatized intersection control. 

3. Design of automatized accident warning systems; such a 
device may be fixed at the front and rear of each vehicle so 
as to give a warning signal, whenever the vehicle intrudes in 
the headway distance, as per relative speed of 
following/leading vehicle. 

4. Proper traffic monitoring at congested links. 

5. Proper maintenance and repair of pavements. 

6. Study of driver behaviour and using the results for future 
studies. 

7. Counting of vehicles with a particular type of number 
plates, thus replacing the manual counting. 

8. Detection of defects in vehicles, which can save much time 
in comparison to manual detection. 

9. Automated vehicle dispatching with efficiency comparable to 
that of an expert dispatcher. 